Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[COLLECTIONS-800] Adding partitionBalanced(List,int) method #265

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ClaudioConsolmagno
Copy link

@ClaudioConsolmagno ClaudioConsolmagno commented Nov 4, 2021

Returns consecutive sublists of a list, partitioned in a way to balance entries across all sublists. See COLLECTIONS-800 for more details.

Implementation Notes:

  • Given that the Partition class is a private static class not used by anything else I think it's ok to just use isBalanced as a boolean instead of, say, an enum or having a separate brand new private class for this new method.
  • Added several comments to clarify the logic, let me know if you think that's too much.
  • I didn't know how far I could go when refactoring existing tests so I only half changed them for partition() method.
  • Similarly, existing tests all around seem fairly simple but I thought I should have good coverage and test different scenarios for partitionBalanced() hence added the whole forEach test cases (I'm a fan of Spock and data driven testing).

@coveralls
Copy link

Coverage Status

Coverage increased (+0.002%) to 90.139% when pulling 4557a45 on ClaudioConsolmagno:COLLECTIONS-800-partition-balanced into 7ab4e4f on apache:master.

// when index hasn't crossed the threshold we don't need balancing yet
if (currentPartitionSize > 1 && threshold > 0 && index >= threshold) {
start -= (index - threshold); // adjust start as partitions before threshold have one less element
currentPartitionSize--; // currentPartitionSize is decremented as threshold is crossed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newbie myself, but I wonder if that could be written more simply:

evenlyDistributedSize = ...
if (index < threshold) { // lists before threshold should contain +1
    start = index * (evenlyDistributedSize + 1);
    end = start + (evenlyDistributedSize + 1);
} else {
    start = index * evenlyDistributedSize + threshold; // offset other starting positions by threshold
    end = start + evenlyDistributedSize;
}

...or similar (Note: this is untested). The if condition and subtractions look quite complicated in my opinion.

* list, partitioned in a way to balance entries across all sublists. For example,
* partitioning a list containing {@code [a, b, c, d, e]} with a partition
* size of 3 yields {@code [[a, b, c], [d, e]]} -- an outer list containing
* two inner lists of three and two elements, all in the original order. Partitioning

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you would use the example from ListUtils.partition, however this could be confusing IMO if you start comparing what method to use and see no difference so far. I would opt for a more clear example here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants